Abstract

Retroviruses have been infecting mammals for at least 100 million years, leaving descendants in host genomes known as 
endogenous retroviruses (ERVs). The abundance of ERVs is partly determined by their mode of replication, but it has also 
been suggested that host life history traits could enhance or suppress their activity. We show that larger bodied species 
have lower levels of ERV activity by reconstructing the rate of ERV integration across 38 mammalian species. Body size 
explains 37% of the variance in ERV integration rate over the last 10 million years, controlling for the effect of confounding
due to other life history traits. Furthermore, 68% of the variance in the mean age of ERVs per genome can also be explained 
by body size. These results indicate that body size limits the number of recently replicating ERVs due to their detrimental 
effects on their host. To comprehend the possible mechanistic links between body size and ERV integration we built a 
mathematical model, which shows that ERV abundance is favored by lower body size and higher horizontal transmission 
rates. We argue that because retroviral integration is tumorigenic, the negative correlation between body size and ERV 
numbers results from the necessity to reduce the risk of cancer, under the assumption that this risk scales positively with 
body size. Our model also fits the empirical observation that the lifetime risk of cancer is relatively invariant among 
mammals regardless of their body size, known as Peto's paradox, and indicates that larger bodied mammals may have 
evolved mechanisms to limit ERV activity. 
Introduction

Mammalian genomes contain large numbers of endogenous ret- 
roviruses (ERVs), derived from multiple independent germline 
invasions over evolutionary time. The human genome contains 
31-40 such ERV invasions, termed 'families', each derived from a 
distinct ancestral exogenous retrovirus [1,2]. These ERVs can con- 
tinue proliferating after the initial germline invasion until they are 
inactivated, either through the acquisition of substitutions that 
occur at the host background level (~10⁻³ per base per my) or by
recombinational deletion [3,4]. Most ERV families proliferate by 
reinfection, although some ERVs occasionally switch from rein- 
fecting germline cells to an entirely intracellular life, and this switch 
can lead to an increase in the size of the ERV family [5]. As a result
of these processes, ERVs have come to occupy ~5-10% of their
hosts' genomes [6,7]. 

The fixation of a new ERV insertion is influenced by its fitness 
consequences to the host, and other population genetic parameters 
[8]; for example a neutral ERV could fix by drift, and a slightly 
deleterious insertion may hitchhike or fix during a population 
bottleneck [9]. A small number of ERVs have been exapted and
have beneficial functions in their host [10-12], but the integration 
of retroviruses into or near host genes can have highly deleterious 
effects, as the consequent disruption or alteration of gene expression 



can lead to malignant transformation [13]. Furthermore, illegiti- 
mate recombination between ERVs at different loci can also have 
deleterious effects, as can the expression of viral proteins. The 
uncontrolled proliferation of ERVs would therefore be extremely 
detrimental to their host [14], and this process must be limited either
by cessation of replication activity, or by host mediated suppression
[15]. Vertebrate genomes have evolved a range of responses that act
at various stages of the viral life cycle to limit retroviral replication 
and its associated tumorigenic potential [16,17]. 

The diversity and activity of ERVs across mammalian genomes 
has not been systematically assessed, and it remains unclear what 
factors have determined ERV abundance in their hosts. Mice and 
humans, the first two mammalian species to have their genomes 
sequenced, show strikingly different patterns of ERV activity - most 
human endogenous retroviruses are inactive, with a striking deceleration
in activity over the last 25 million years [7]. In contrast, the
mouse genome shows no sign of deceleration in ERV activity and a 
large number of murine ERVs are active and unfixed in the mouse 
population [6]. This difference is also reflected in the proportion of
catalogued mutant alleles that are due to ERV insertions; ~10% of
mutant alleles can be attributed to ERVs in mice, whereas no such 
alleles can be attributed to ERVs in humans [18]. It has been sug- 
gested that the markedly different ERV activity in human and mouse 
genomes can be explained by systematic factors in the biology of 
Author Summary

Retroviruses have been invading mammalian genomes for 
over 100 million years, leaving traces known as endoge- 
nous retroviruses (ERVs). Early genome sequencing studies 
revealed a marked difference in the activity of retroviruses 
among species, with humans largely containing inactive 
lineages of ERVs, while the mouse contains numerous 
lineages of active ERVs. We explore the hypothesis that life 
history traits determine the activity of ERVs in mammalian 
genomes, and show that larger mammals have fewer ERV 
copies over recent evolutionary time (the last 10 million 
years) compared to smaller mammals. This association is 
determined by body size independently of any confound- 
ing variables. We build a mathematical model that shows 
that ERV abundance in genomes decreases with larger 
body size and increases with horizontal transmission. 
Retroviral integration can cause cancer, and our analysis 
suggests that larger bodied animals control ERV replication 
in order to postpone cancer until a post-reproductive age. 
This is in line with a long-standing observation that cancer 
rates do not fluctuate among mammals of different body 
size, a phenomenon known as Peto's paradox, and opens 
up the possibility that larger animals have evolved 
mechanisms to limit ERV activity. 



these hosts [14]. We explore the hypothesis that differences in ERV
activity across mammals are determined by differences in host life 
history, with smaller bodied animals expected to have higher levels 
of ERV activity. We compare body size with ERV numbers using 
data from a diverse set of 38 mammals in a multivariate analysis, 
controlling for confounding variables such as life history traits. We 
also explore the effect of body size and horizontal transmission on 
ERV dynamics through a mathematical model. Finally, we discuss 
our associations of body size and ERV replication in the light of 
evolutionary theory and cancer biology. 

Results 

Body size has a negative correlation with ERV abundance 
across mammals 

By analysing 38 mammalian genomes over approximately the 
last 10 my period we find a negative relationship between the 
number of integrated ERVs and body size (Figure 1b, Figure 2a).
The correlation is robust if instead of present day body size we use
the reconstructed body size at 5 million years ago (P = 0.0069,
R² = 0.31), and remains significant if we use a single substitution
rate for all mammals (P = 0.01, R² = 0.25). The correlation is dependent
on the age of the integration in the genome and is no longer
significant when we consider ERVs that are older than 10 my
(Figure 1c, Figure 1d). If we exclude ERVs that belong to our previously
defined megafamilies [5] with mean divergence <10%, namely
the IAP family from Cavia porcellus, the IAP family from Dipodomys
ordii, a Class I family from Felis catus, and the IAP and ERV-L
families from Mus musculus, the correlation remains (P = 0.0042,
R² = 0.30). If we split ERVs into their traditional classes, the
correlation is significant only for the class II ERVs (Figure 3). There 
is no data to suggest systematic differences in the biology of 
retroviruses from different classes, given that the majority of ERVs 
are derived from extinct retroviral lineages. The three classes differ 
in their age distribution; class II ERVs are much younger (Figure 4), 
with the majority of insertions falling within the 10 my era that we 
use to define the young age category. Thus the observed relationship 
between body size and Class II ERVs is likely due to their recent 



replication and not some other difference in their biology such as 
higher pathogenicity. 

The correlation of body size with ERV abundance is not 
confounded by another life history trait 

Since life history traits are correlated with each other, it is pos- 
sible that the apparent and inferred correlation of ERVs with body 
size could be confounded by another trait such as reproductive 
output (for which gestation period is a proxy) and timing (age at 
sexual maturity) [19]. The number of mates or the type of placenta 
might also influence ERV abundance via an increased risk of hor- 
izontal or vertical retroviral transmission, respectively. To clarify if 
number of sexual partners has played a role in determining the 
number of ERVs per genome, we use testis size as a proxy as it is 
known to correlate with the number of mates and the strength of 
sexual selection in mammals [20,21]. To assess the effect of the 
placental type we modeled placental invasiveness as a semiquantitative
parameter (i.e. marsupials = 1, epitheliochorial = 2, endotheliochorial = 3
and hemochorial = 4) [22]. We evaluated the
correlation between ERV integration rate and potential con- 
founders with multivariate models and standard stepwise forward 
model selection. We included in turn the following confounding 
variables; time to sexual maturity, gestation period, life span, testis 
size and placental invasiveness. Body size remained as the only 
significant variable confirming that it is the only significant predictor 
of ERV integration rate over the last 10 my (Table 1). The models 
remain significant when we account for phylogenetic non-indepen- 
dence [23], reconstruct ancestral mass and/or incorporate a body 
mass dependent substitution rate. Thus, unlike substitution rate 
[24,25], ERV integration rate is not a result of shorter generation 
time. We do not find a significant correlation with testis size, either 
as an additional predictor variable (P = 0.3) with body size included 
in the model, or as an interaction term with body size (P = 0.2). 
Thus, the number of mates does not appear to have played a
significant role in determining the number of ERVs per genome.

Another possible confounder is the effective population size of 
the host [26]: species with higher effective population sizes are 
expected to be more efficient at purging slightly deleterious muta- 
tions such as those incurred by ERV proliferation [27]. As a result, 
since larger bodied animals have smaller effective population sizes 
[19,28], we would expect them to have more, not fewer ERVs. 
Thus, confounding due to effective population size would lead to a 
correlation in the opposite direction to what we observe, indicating 
that the observed correlation between body size and ERV 
numbers is robust against variations in effective population size. 

Relationship between ERV abundance and body size can 
be explained by a mathematical model of retrovirus-host 
dynamics 

To explore the possible mechanistic links between body size, 
integration rate and transmission route, we constructed a compartmental
mathematical model (Figure S1) using a system of ordinary differential
equations to describe the epidemiological dynamics of exogenous and
endogenous retroviral infections. There are two broad classes of
individuals that need to be considered, the susceptible population (S)
and the infected population. To gain a more detailed picture of the
latter, namely to elucidate the interconnected roles of exogenous and
endogenous retroviral infections, we further distinguish three infected
sub-populations: individuals infected with an exogenous retrovirus (I_RV),
those infected with a single integrated copy of the retrovirus through the
Figure 1. (a) Correlation between mean age of all ERV integrations and body mass from the genomes of 38 mammals. Body mass is
log-transformed, and the mean ages are calculated correcting for the substitution rate (R² = 0.68, P<0.001). (b, c, d) The relationship between ERV
count and body mass for the number of ERV integrations acquired over the last 10 my, between 10-35 mya and >35 mya in the genomes of 38 
mammals (both values log-transformed). The trend lines representing the slope for the regression, corrected for phylogenetic non-independence, 
and accompanying P-values are plotted. We have taken into account the effect of body size on substitution rate in calculating the ages. 
doi:10.1371/journal.ppat.1004214.g001 



process of endogenisation (I_ERV), and lastly, those infected with
an endogenous retrovirus that has undergone amplification (I_AERV).

The overall level of retroviral activity is directly related to the
copy number of endogenised retroviruses in the infected population.
Since the vast majority of endogenous retroviruses present in
the host population persist in the pool of AERV-infected individuals,
the level of retroviral activity can be represented by the
size of this compartment. We first explore the roles
of three key factors: body size (B), the rate of retroviral endogenisation
(σ), which governs vertical transmission of the retrovirus,
and the force of infection (λ), which determines the rate of horizontal
transmission of the retrovirus. As shown in Figure S2,
increased body size results in a lower number of individuals harbouring
amplified endogenous retrovirus when the system reaches
equilibrium (Figure S2, upper plot), while the rate of retroviral
endogenisation (σ) and the force of infection (λ) display the
opposite relationship (Figure S2, lower plot).

Our model demonstrates that horizontal and vertical transmis- 
sion are both crucial for the eventual endogenisation and ampli- 
fication of the retrovirus. If there is no horizontal transmission (i.e. 
Figure 2. (a) Correlation between ERV count and body mass for the number of ERV integrations acquired over the last 10 my. (b)
Correlation between mean age of all ERV integrations and body mass from the genomes of 38 mammals. Body mass is log-transformed, and the
mean ages are calculated correcting for the substitution rate. We have not taken into account the effect of body size on substitution rate in
calculating the ages.
doi:10.1371/journal.ppat.1004214.g002



λ = 0), then the retrovirus cannot spread and persist in the population,
even if there is a large initial pool of infected individuals
(Figure 5). Similarly, in the situation where no endogenisation events
occur via vertical transmission (i.e. σ = 0), our model shows that
infection can become endemic, but remains completely exogenous.

We explored in more detail the impact of body size on the structure of the
population at equilibrium in relation to the extent of retroviral
endogenisation σ and the force of infection λ (Figure 5).
The results in Figure 5 illustrate that higher rates of horizontal
transmission, represented by the force of infection (λ), lead
to a higher proportion of AERV infections for a given body size,
and furthermore, highlight our finding that larger body size (B) is
associated with a lower extent of AERV activity, with all other parameters
fixed (Figure S2, lower plot). Furthermore, we observe from
Figure 5 that for sufficiently high rates of horizontal transmission,
the proportion of AERV infections plateaus with respect to increasing
body size (B). In this case, larger body size is associated with a
greater proportion of exogenous infections, as new (exogenous) infections
would arise through horizontal transmission at a faster rate
than endogenisation via vertical transmission. To explore the possibility
that the number of horizontal transmissions confounds the
number of elements per genome, we tested whether the number of families
per genome is correlated with body size, and found no significant
correlation (P = 0.15, R² = 0.08).




Figure 3. Correlations of number of ERVs against body mass (both log-transformed) by ERV class. 

doi:10.1371/journal.ppat.1004214.g003
Discussion

We identified 84,223 ERVs, of which 27,711 have integrated in
the last 10 my across 38 species of mammal (Table 2). We find that 
the number of ERV integrations in mammals is negatively correlated 
to body size. This correlation can explain 37% of the variance in 
the number of ERV integrations over the past 10 my. We have 
controlled for confounding variables such as life history and sexual 
selection, and also confirmed robustness to variation in effective 
population size. Nevertheless body size can be influenced by other 
parameters, and it is possible that other factors (e.g. environmental, 
dietary) contribute to both body size and ERV abundance, thereby 
explaining part of the remaining variance; for example they might 



account for the residual variance of outliers (e.g. Dasypus
novemcinctus and Canis familiaris). Interestingly, Microcebus
murinus, whose life history
evolved rapidly due to its isolation in Madagascar [29], might
be expected to be a significant outlier in the correlation, but is 
very close to the regression line. Perhaps the global distribution and 
geographic isolation of a species is another determinant of the 
variance in ERV abundance. 

We also see that 68% of the variance observed in the mean age 
of ERV integrations in a genome (a proxy for recent replication) is 
explained by body size (Figure 1a), with the number of young (i.e.
recently replicating) insertions correlating inversely with body size
(Figure 1b) while the number of older insertions does not (Figure 1c,



Table 1. Phylogenetically corrected correlations of number of ERVs per genome acquired over the last 10 my (log) against life 
history traits (LHT) confounders. 
d). These observations suggest that body size limits the number of
ERVs in a genome and that the presence of recently replicating 
ERVs is detrimental to their host. As ERVs accumulate host- 
induced mutations over time, their activity diminishes until they 
eventually become neutral. Our study suggests that ERVs that 
have been active within the last 10 million years could still have 
moderately deleterious effects, probably in the post reproductive 
age. Furthermore, the age of an ERV is a good proxy for its path- 
ogenic potential, with pathogenicity decreasing over time. Some 
ERV families have retained replication capacity for millions of years; 
HERV-K (HML2) first invaded the primate genome >30 my ago yet
has still been active up until at least ~500,000 years ago [30,31]. 



These recently active ERVs may retain some level of virulence, and 
therefore still have the potential for malignant transformation [13]. 
In line with this prediction of intermediate virulence, reconstructed 
ERVs [32] or recently established present day ERVs [33,34] have 
low but detectable viral loads. The presence of pathogenic ERVs in 
a genome after such a long period of time may appear surprising. It 
could however be explained by analogy to models of the transmis- 
sibility of pathogens within the context of host-parasite co-evolu- 
tionary dynamics [35,36]. Such models incorporate the effects of 
both transmissibility and virulence on the reproductive success of a 
parasite, and show that they do not necessarily evolve to be harm- 
less; in some empirical datasets reproductive success is maximised 
at intermediate levels of both of these parameters [35]. In other
words a pathogen can continue to be virulent despite selection 
imposed by the host for a more benign infection. 

We have modelled the spread of retroviruses among hosts and 
within genomes, distinguishing between exogenous and endoge- 
nous retroviruses, and taking into account vertical and horizontal 
modes of transmission. A key aspect of our model is the assump- 
tion that the deleterious effects of a retrovirus in a genome scale 
with body size. The model shows that as body size increases, the 
proportion of individuals in the host population that carry ERVs 
drops (Figure S2). Elevated rates of either endogenisation or horizontal
transmission lead to higher ERV abundance and accelerate
the rate at which ERV abundance increases with body size. For
any given rate of horizontal transmission, however, the overall relationship
between body size and ERV abundance is maintained
(Figure 5). In our model, the body size-associated pathogenic effect
of an ERV in a genome is equivalent, whether it has been generated 
by vertical or horizontal transmission. Horizontal transmission of an 
ERV would require somatic expression and replication of the virus 
in order to propagate effectively, which may in turn increase the 
mortality of the host via a direct result of retroviral infection. Experi- 
mental evidence suggests that infections with replication competent 
retroviruses are more pathogenic when retroviral replication is high 
(e.g. HIV, or the recently endogenised Koala retrovirus [34,37]). 
One way in which the pathogenicity of an ERV can be reduced 
while its replicative capacity is maintained is through epigenetic 
regulation in somatic cells. During genomic reprogramming of the 
germline, transposable elements are expressed and can replicate 
before being silenced [38,39], resulting in lower levels of expression 
in somatic tissues and hence lower transmissibility. Thus, low levels 
of replication in somatic cells may be favorable for an ERV, enabling 
it to maximize its own success via vertical transmission while min- 
imizing harm to the host. The association between ERV abun- 
dance and body size indicates that somatic replication cannot be 
completely suppressed and that the pathogenetic effects of ERVs 
cannot be dissociated from their copy number. 

On a macroevolutionary timescale, ERV copy number will be 
determined both by the number of cross species transmissions and 
the subsequent proliferation of ERV families. The number of families 
per genome is orders of magnitude lower than the number of ERVs 
(mean number of families = 23, mean number of elements = 1073), 
and most ERVs within a genome come from a small number of 
families, the so-called superspreaders (or megafamilies) [5] . In line 
with this uneven distribution of ERVs among families, we do not see 
a correlation between the number of families within a genome and 
body size (P = NS, R² = 0.08). Furthermore, ERVs that belong to
megafamilies lack the env gene that is required for horizontal
transmission, highlighting the importance of vertical transmission in 
determining ERV abundance, despite the ability of ERVs to cross 
species on timescales spanning millions of years. 

Crucially, according to our model the selective cost of an ERV 
is determined by the body size of the host. Larger bodied animals 
would be expected to have a higher lifetime risk of cancer as a con- 
sequence of having both more dividing cells and longer lifespans. 
No such association is observed in nature, with relatively invariable 
risks of cancer in animals with differing body sizes, a phenomenon 
known as Peto's Paradox [40,41]. Under our model, the risk of 
retrovirally induced cancer also scales similarly with body size. The 
observed negative correlation between body size and ERV integra- 
tion rate suggests that larger mammals attain a lower ERV virulence 
cost per body size unit by reducing the number of ERVs in their 
genome. This should therefore enable them to postpone the onset of 
cancer until after their reproductive age. 
Our results indicate that larger animals exert greater control
over ERV proliferation. This could be due to the evolution of 
mechanisms capable of limiting retroviral activity and consequently 
limiting the incorporation of ERVs in the genome. Such mecha- 
nisms could involve the enhancement of innate or adaptive responses 
to retroviruses [16,17], or perhaps epigenetic regulation [42] is more 
potent in larger mammals. An intriguing alternative is that the effect 
is indirect via an improved immune surveillance - some genes 
involved in pattern recognition for defence against pathogens such 
as viruses are also involved in controlling cancers [43]. Antiviral 
genes are the result of a continuous and ancient arms race between 
viruses and their hosts [15], and elucidating their roles in controlling 
cancer across animals of different body size could provide insights 
into cancer susceptibility. 

Materials and Methods 

ERV mining and dating of insertions 

Our mining of the 38 mammal genomes has been described 
previously [5,44,45]. We estimated age based on the divergence from 
the most similar other ERV insertion in the same genome ("nearest 
neighbour"). We favour this approach over cruder metrics that are 
based on divergence from a consensus sequence, as it takes into 
account the phylogeny of the ERVs, and over approaches based on 
divergence between paired LTRs due to the variable quality of the 
genomes being analysed, most of which do not contain contigs that 
are long enough to include complete proviral elements. We first 
calculated nucleotide divergence from the most similar other ERV 
insertion in the same genome as described in Magiorkinis et al. [5] , 
and then converted this to an integration date assuming a mean 
nucleotide substitution rate at neutral nuclear protein coding sites 
in mammals of 2.2 × 10⁻⁹ per site per year [46], and corrected for
multiple hits using the Jukes-Cantor model. To calculate the average
age of ERVs in each genome we took into account the known
effect of body size on substitution rate by using a regression of rate
against mass with slope of −0.09, i.e. log(adjusted rate) = 0.09 ×
(log(mean mass) − log(mass)) + log(unadjusted rate) [47]. We also
repeated the correlation between body size and ERV number with a
single substitution rate for all mammals.
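
The dating steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in the study: the function names are hypothetical, and the sketch assumes that age is the corrected distance divided by the rate. The text does not state whether a factor of two is applied for the two diverging copies, so that choice is exposed as the `branches` argument.

```python
import math

NEUTRAL_RATE = 2.2e-9  # substitutions per site per year, from the text [46]
MASS_SLOPE = 0.09      # magnitude of the rate-versus-mass slope, from the text [47]


def jukes_cantor(p):
    """Correct a raw proportion of differing sites p for multiple hits."""
    if not 0.0 <= p < 0.75:
        raise ValueError("Jukes-Cantor distance is defined only for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def adjusted_rate(mass, mean_mass, rate=NEUTRAL_RATE):
    """Body-mass-adjusted substitution rate.

    Equivalent to log(adjusted) = 0.09 * (log(mean_mass) - log(mass)) + log(rate);
    the base of the logarithm cancels out.
    """
    return rate * (mean_mass / mass) ** MASS_SLOPE


def insertion_age_my(p, rate=NEUTRAL_RATE, branches=1):
    """Age in million years from nearest-neighbour divergence p.

    ASSUMPTION: age = distance / (branches * rate). branches=1 is a guess;
    use branches=2 if divergence is taken to accumulate on both copies.
    """
    return jukes_cantor(p) / (branches * rate) / 1e6


def age_category(age_my):
    """Assign an insertion to the three age classes used in Figure 1."""
    if age_my < 10:
        return "<10 my"
    if age_my <= 35:
        return "10-35 my"
    return ">35 my"
```

Under these assumptions a nearest-neighbour divergence of 2% corresponds to roughly 9.2 my with the unadjusted rate, which places the insertion in the youngest class; a heavier-than-average host has a lower adjusted rate and therefore an older inferred age for the same divergence.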

Incorporating ancestral body mass 

Using the data above we reconstructed ancestral body masses 
assuming a Brownian motion model of trait evolution as imple- 
mented in the package GEIGER in the R language [48]. This 
program returns the estimated body mass at nodes in the tree, and 
from these we calculated values at the mid-points of our time 
intervals (averaging where necessary). We then manually pruned 
our trees to this point and repeated the regression between number 
of ERVs and body mass for each time interval, taking the phy- 
logeny into account (Table 2). Our regressions were performed 
with both present day body size and the reconstructed body size at 
the mid-point of our time intervals (e.g. body size at 5 million years 
ago for regression against activity during the last 10 million years). 
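
One plausible reading of "values at the mid-points of our time intervals" is an interpolation along each branch between the reconstructed node values. The sketch below illustrates that reading only; it is not taken from GEIGER or from the study's scripts. The function names are hypothetical, and it assumes that the Brownian motion model is applied to log-transformed mass, in which case the expected value at a point on a branch, given both end nodes, is the linear interpolation in log mass.

```python
import math


def log_mass_at(time_mya, child_age, child_mass, parent_age, parent_mass):
    """Interpolated log10 body mass at time_mya on a single branch.

    ASSUMPTION: linear interpolation in log10(mass) between the younger
    (child) node and the older (parent) node; ages are in million years ago.
    """
    if not child_age <= time_mya <= parent_age:
        raise ValueError("time_mya must lie between the child and parent ages")
    if parent_age == child_age:
        return math.log10(child_mass)
    frac = (time_mya - child_age) / (parent_age - child_age)
    return (1.0 - frac) * math.log10(child_mass) + frac * math.log10(parent_mass)


def mass_at(time_mya, child_age, child_mass, parent_age, parent_mass):
    """Back-transformed body mass at time_mya on the same branch."""
    return 10 ** log_mass_at(time_mya, child_age, child_mass, parent_age, parent_mass)
```

For example, a lineage weighing 1 kg today whose ancestral node at 10 mya is reconstructed at 100 kg would be assigned 10 kg at 5 mya, the mid-point used for the regression against activity over the last 10 my.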

Multivariate analysis 

Life history traits correlate with each other; for example larger 
bodied animals tend to live longer and have smaller effective 
population size [19,28]. Therefore body size could in principle be 
a surrogate measure of a different life history trait, as has been pre- 
viously shown for substitution rate [24]. Mammalian life history 
data was taken from [49] and the phylogenetic tree from [50]. We
collected the testis size for 24 out of 38 species in our study (Table
S1). We used the Generalized Least Squares (GLS) approach as
implemented by the Analyses of Phylogenetics and Evolution
(APE) package [51] in R. We used standard model selection to
identify significant confounders of ERV numbers per genome 
(Table 1). 

A mathematical model of ERV persistence and evolution 

Model (1) captures the fundamental dynamics of retroviral 
infections including the processes of retroviral endogenisation and 
amplification. The key interactions of the model are illustrated 
schematically in Figure S1.



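\[
\begin{aligned}
\frac{dS}{dt} &= \underbrace{b}_{\text{new susceptible births}}
  - \underbrace{\lambda S}_{\text{transmission from RV + AERV infected}}
  - \underbrace{\mu S}_{\text{death of susceptibles}} \\
\frac{dI_{RV}}{dt} &= \underbrace{\lambda S}_{\text{new infections from RV + AERV}}
  - \underbrace{\mu I_{RV}}_{\text{death of RV infected}} \\
\frac{dI_{ERV}}{dt} &= \underbrace{\sigma \mu I_{RV}}_{\text{births with integrated ERV}}
  - \underbrace{\alpha(B) I_{ERV}}_{\text{amplification}}
  - \underbrace{\mu I_{ERV}}_{\text{death of ERV infected}} \\
\frac{dI_{AERV}}{dt} &= \underbrace{\mu I_{AERV}}_{\text{births of AERV by AERV infected}}
  + \underbrace{\alpha(B) I_{ERV}}_{\text{amplification}}
  - \underbrace{(\mu + \mu_{AERV}(B)) I_{AERV}}_{\text{death of AERV-infected, background and cancer-induced}}
\end{aligned}
\tag{1}
\]

where b = μ[N − σ I_RV − I_AERV] + μ_AERV(B) I_AERV and λ =
β₁ I_RV/N + β₂ I_AERV/N.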

In model (1), we consider both vertical and horizontal routes of 
transmission. We also distinguish between exogenous and endog- 
enous retroviral infections. Whereas horizontal transmission can 
only lead to infection with an exogenous retrovirus (i.e. RV com- 
partment), vertical transmission can result in retroviral endogen- 
isation (i.e. ERV compartment) and subsequent amplification (i.e. 
AERV compartment). 

There are two ways in which new (exogenous) retroviral infec- 
tions may arise horizontally in an initially susceptible individual, 
either through contact with an individual infected with an exog- 
enous retrovirus (i.e. RV compartment), or alternatively via 
exposure to an individual infected with an amplified endogenous 
retrovirus (i.e. AERV compartment). We assume that individuals 
infected with only a single integrated copy of the retrovirus (i.e. ERV 
compartment) are unable to transmit the infection horizontally 
between hosts. The force of infection λ is composed of two terms,
λ = λ₁ + λ₂, and thus reflects the dual modes of horizontal
transmission. There are various functional forms for the
force of infection, and we choose the commonly used form
λ = β₁ I_RV/N + β₂ I_AERV/N, where β₁ and β₂ are the respective
coefficients of infectious transmissibility
for RV-infected and AERV-infected individuals, and N is
the total population which is assumed to be constant.

A small proportion σ, where 0 < σ < 1, of births from individuals
who are infected by an exogenous retrovirus acquire an integrated
endogenous copy of the retrovirus, thereby entering the ERV compartment.
Meanwhile, individuals infected with an integrated endogenous
retrovirus (in the ERV compartment) undergo retroviral
amplification at a rate α(B), which is dependent on body size (B). A
consequence of retroviral amplification is a greater number of 
endogenous retroviruses, therefore the size of the compartment of 
individuals harbouring amplified, endogenised retroviruses is an 
indirect measure of the overall extent of retroviral activity. Births 
arising from infected individuals with amplified, endogenous retrovirus 
(i.e. AERV compartment) themselves harbour amplified, endoge- 
nous retroviruses. 

To investigate the system without over-complicating
the dynamical behaviour of the model, we consider a population
that is maintained at a fixed size (N > 0) so that (S + I_RV + I_ERV +
I_AERV) = N. The pool of susceptible individuals is maintained by
the birth of new susceptible individuals and is encapsulated in the
term b in the S equation, which includes new births of susceptibles
from all other compartments as well as a term to balance the in- and
out-flux of individuals in the system and ensure that the total
population remains constant. We assume that background birth and
death rates in each compartment are equal at a constant rate μ.
Additional mortality due to the detrimental effects of amplified,
endogenous retroviral infection, such as the development of cancer,
is reflected in the parameter μ_AERV(B), which depends on body size
(B). Excess mortality as a consequence of cancer is fed back into the
susceptible pool, so that the birth of susceptible individuals
can be encapsulated by the term b, where b = μ[N − σ I_RV − I_AERV]
+ μ_AERV(B) I_AERV.
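
The balancing role of b can be checked directly. As a sketch, assume the compartment equations have the form implied by their term labels: infection at rate λS, background death at rate μ in every compartment, a fraction σ of births from RV-infected individuals entering the ERV compartment, amplification at rate α(B), and excess mortality μ_AERV(B) in the AERV compartment. Summing the four equations, the infection and amplification terms cancel, as do the births and background deaths of the AERV compartment:

```latex
\[
\frac{d}{dt}\left(S + I_{RV} + I_{ERV} + I_{AERV}\right)
  = b - \mu\left(S + I_{RV} + I_{ERV}\right) + \sigma\mu I_{RV} - \mu_{AERV}(B)\, I_{AERV}
\]
Substituting $b = \mu\left[N - \sigma I_{RV} - I_{AERV}\right] + \mu_{AERV}(B)\, I_{AERV}$ gives
\[
\frac{d}{dt}\left(S + I_{RV} + I_{ERV} + I_{AERV}\right)
  = \mu\left[N - \left(S + I_{RV} + I_{ERV} + I_{AERV}\right)\right] = 0
  \quad\text{whenever } S + I_{RV} + I_{ERV} + I_{AERV} = N .
\]
```

Under that assumed form, a population that starts at size N therefore remains at size N, which is the constraint the definition of b is designed to enforce.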

The above discussion highlights an important trade-off between
retroviral amplification α(B), which is beneficial to the long-term
persistence of the retrovirus, and increased mortality μ_AERV(B) in excess
of background death rates as a consequence of the detrimental
effects associated with increased retroviral activity. These two factors
both depend on body size (B), with opposing consequences for the
retrovirus. Whereas larger body size means increased retroviral
amplification, it also results in greater mortality, so that both α(B)
and μ_AERV(B) are increasing functions of B. We therefore investigate
the role of body size (B) in the outcome of infection. Two additional
parameters of significance are the force of infection λ and the rate of
retroviral endogenisation σ, and we examine how varying body size
influences the dynamical behaviour of the infection under model (1)
with respect to each. For
the former, we explore how body size can affect the system when
differences between the force of infection (λ) of individuals infected
with exogenous retrovirus (i.e. the I_RV compartment) versus those
carrying amplified, endogenous retrovirus (i.e. the I_AERV compartment)
are taken into account. For the latter, it is expected that a
higher rate of endogenisation would result in a greater proportion
of individuals with integrated endogenous retroviruses, and we are
interested in determining the role of body size with respect to
differences in endogenisation rates. Because we have assumed that
the total population remains constant, it is sensible to investigate the
dynamics of the model with respect to proportions of the total
population rather than in terms of the sizes of each compartment.
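
The compartmental system described in this section can be explored numerically with a short script. The following is a sketch under stated assumptions, not the authors' implementation: the text does not give the functional forms of the amplification rate and the excess mortality, so `alpha` and `mu_aerv` below are hypothetical linear functions of body size, all parameter values are illustrative, and a hand-written fourth-order Runge-Kutta step is used to keep the example dependency-free.

```python
def alpha(B, a0=0.01):
    """HYPOTHETICAL amplification rate, assumed to increase linearly with B."""
    return a0 * B


def mu_aerv(B, m0=0.02):
    """HYPOTHETICAL excess (cancer-induced) mortality, assumed linear in B."""
    return m0 * B


def derivatives(state, B, sigma, beta1, beta2, mu, N):
    """Right-hand side for (S, I_RV, I_ERV, I_AERV) as described in the text."""
    S, I_rv, I_erv, I_aerv = state
    lam = (beta1 * I_rv + beta2 * I_aerv) / N            # force of infection
    b = mu * (N - sigma * I_rv - I_aerv) + mu_aerv(B) * I_aerv
    dS = b - lam * S - mu * S
    dI_rv = lam * S - mu * I_rv
    dI_erv = sigma * mu * I_rv - alpha(B) * I_erv - mu * I_erv
    dI_aerv = mu * I_aerv + alpha(B) * I_erv - (mu + mu_aerv(B)) * I_aerv
    return (dS, dI_rv, dI_erv, dI_aerv)


def rk4_step(state, dt, **params):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = derivatives(state, **params)
    k2 = derivatives(shifted(k1, dt / 2), **params)
    k3 = derivatives(shifted(k2, dt / 2), **params)
    k4 = derivatives(shifted(k3, dt), **params)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(state, t_end, dt=0.1, **params):
    """Integrate from t = 0 to t_end and return the final state."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, **params)
    return state


def proportions(state):
    """Express compartment sizes as proportions of the total population."""
    total = sum(state)
    return tuple(x / total for x in state)
```

Calling `simulate((900.0, 100.0, 0.0, 0.0), 500.0, B=1.0, sigma=0.1, beta1=0.5, beta2=0.5, mu=0.1, N=1000.0)` and passing the result to `proportions` gives the population structure at the end of the run; setting `beta1 = beta2 = 0` corresponds to the λ = 0 case, in which the infected compartments decay. Any dependence on B seen with this sketch reflects the assumed linear forms and should not be read as a reproduction of Figure 5 or Figure S2.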

Supporting Information 

Figure S1 A schematic diagram of the model representing the
interactions among four distinct subpopulations: susceptibles (S),
infected with (exogenous) retrovirus (I_RV), infected with integrated
(endogenous) retrovirus (I_ERV), and infected with amplified integrated
(endogenous) retrovirus (I_AERV).
(EPS) 

Figure S2 The results of model (1) show that a lower proportion of
the population infected with amplified, endogenous retrovirus (i.e.
the AERV compartment) is associated with a larger body size (B),
and with lower rates of endogenisation (σ) and force of infection (λ). The
model also predicts that a higher rate of retroviral endogenisation
(σ) and a greater force of infection (λ) are both linked to a shorter
time to reach the endemic steady state.
(EPS) 

Table S1 Testis size for 24 species.
(DOCX) 